# Convolutional Vision Transformer
CvT-21-384
Apache-2.0
CvT-21 is an image classification model based on the Convolutional Vision Transformer architecture, pretrained on the ImageNet-1k dataset at a resolution of 384x384.
Image Classification
Transformers

microsoft
29
1
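
Since the card lists Transformers as the library and image classification as the task, a minimal usage sketch might look like the following. It assumes the checkpoint is published on the Hugging Face Hub under the id `microsoft/cvt-21-384` (inferred from the author and model name shown here, not stated on this page) and that `torch`, `transformers`, and `Pillow` are installed. The helper name `classify` is hypothetical.

```python
# Assumed Hub id, inferred from the listing's author ("microsoft") and model name.
MODEL_ID = "microsoft/cvt-21-384"


def classify(image_path, model_id=MODEL_ID):
    """Return the predicted ImageNet-1k label for a single image file."""
    # Heavy dependencies are imported lazily, so defining this helper
    # does not require them to be installed.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, CvtForImageClassification

    # The processor handles resizing and normalization for the checkpoint.
    processor = AutoImageProcessor.from_pretrained(model_id)
    model = CvtForImageClassification.from_pretrained(model_id)
    model.eval()

    image = Image.open(image_path).convert("RGB")
    inputs = processor(images=image, return_tensors="pt")

    # Inference only: no gradients are needed.
    with torch.no_grad():
        logits = model(**inputs).logits

    predicted = int(logits.argmax(dim=-1).item())
    return model.config.id2label[predicted]


# Example call (downloads the weights on first use):
# label = classify("path/to/image.jpg")
```

Passing a different `model_id`, such as the CvT-13 checkpoint listed next, should work the same way if it is published under an analogous id.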
CvT-13-384
Apache-2.0
CvT-13 is an image classification model based on the Convolutional Vision Transformer architecture, pretrained on the ImageNet-1k dataset at a resolution of 384x384. The architecture improves on standard vision transformers by introducing convolutional operations.
Image Classification
Transformers

microsoft
27
0
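
Both descriptions mention convolutional operations and a 384x384 input resolution. As a rough illustration of how the two interact, the sketch below computes the side length of the token grid after each convolutional token embedding stage. The per-stage kernel, stride, and padding values are assumptions taken from the commonly published CvT configuration, not from this page, and the function names are hypothetical.

```python
def conv_output_size(size, kernel, stride, padding):
    """Standard output-size formula for a strided convolution."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed (kernel, stride, padding) for the three token embedding stages.
# These follow the commonly published CvT configuration; verify against
# the checkpoint's own config before relying on them.
STAGES = [(7, 4, 2), (3, 2, 1), (3, 2, 1)]


def token_grid_sizes(image_size, stages=STAGES):
    """Return the side length of the spatial token grid after each stage."""
    sizes = []
    size = image_size
    for kernel, stride, padding in stages:
        size = conv_output_size(size, kernel, stride, padding)
        sizes.append(size)
    return sizes


grids = token_grid_sizes(384)
for stage, side in enumerate(grids, start=1):
    print(f"stage {stage}: {side}x{side} grid, {side * side} spatial tokens")
# stage 1: 96x96 grid, 9216 spatial tokens
# stage 2: 48x48 grid, 2304 spatial tokens
# stage 3: 24x24 grid, 576 spatial tokens
```

Under these assumptions, each stage shrinks the grid with a strided convolution rather than a fixed non-overlapping patch split, which is the sense in which the design mixes convolutions into a vision transformer.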